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1 .  INTRODUCTION 


A  possible  indication  of  the  existence  of  global  climate  warming  is  the  presence  of  a  trend 
in  the  travel  time  of  an  acoustic  signal  along  several  ocean  paths  over  a  period  of  many 
years.  This  report  presents  improved  techniques  for  testing  for  trend  in  such  time  series 
data.  For  background  on  the  use  of  statistical  time  series  and  trend  extraction  methods  for 
ocean  acoustic  global  warming  studies  the  reader  is  referred  to  our  previous  report 
[Bottone,  Gray,  and  Woodward,  1995]. 

The  specific  problem  we  address  in  this  report  is  that  of  testing  for  trend  using  the  model 

Xt  —  ci  +  bt  +  Et ,  ( 1 ) 

where  Et  may  be  highly  correlated  stationary  noise.  In  the  ocean  acoustics  problem,  Xt 
represents  the  acoustic  travel  time  along  a  fixed  path  at  time  t,  however,  all  of  the 
techniques  discussed  in  this  report  apply  to  any  time  series  which  can  be  represented  by  the 
model  given  by  equation  (1).  Testing  for  trend  in  a  given  set  of  time  series  data  assumed 
to  be  modeled  by  equation  (1)  amounts  to  testing  the  hypothesis  b  =  0 .  If  the  value  of  b 
estimated  from  the  data  is  significantly  (at  some  given  significance  level)  different  from 
zero,  a  trend  is  said  to  exist  in  the  data.  If  the  distribution  of  the  noise,  Et,  is  completely 

known,  this  problem  is  relatively  simple.  If,  however,  as  is  usually  the  case  in  practice, 
the  distribution  of  Et,  or,  at  least,  its  parameters,  must  the  estimated  from  the  data,  the 
problem  is  somewhat  more  difficult.  In  particular,  when  the  correlation  in  the  E,  is  high, 
as  appears  to  be  the  case  with  ocean  acoustic  data,  the  problem  becomes  extremely  difficult 
due  to  the  difficulty  in  obtaining  reasonable  parameter  estimates  without  exceedingly  long 
samples  lengths. 

Previously  existing  methodology  for  testing  for  trend  in  the  model  given  by  equation  (1)  is 
generally  invalid  when  the  correlation  in  the  Et  is  high,  unless  the  sample  size  is 

unacceptably  large.  These  methods  are  unable  to  differentiate  between  trends  which  are 
due  to  b  *  0  and  “temporary  trends”  that  are  due  to  high  correlation  in  Er  This  inability  to 

differentiate  between  a  true  trend  and  high  correlation  manifests  itself  in  inflated 
significance  levels.  That  is,  when  b  is  equal  to  zero  in  equation  (1)  and  the  Et  are  highly 

correlated  and  existing  methods  are  used  to  compute  the  significance  level  using  a  nominal 
5%  level,  the  percentage  of  realizations  from  (1)  for  which  a  nonzero  b  is  (incorrectly) 
detected  can  be  as  high  as  25%-50%  for  all  existing  tests.  The  impact  is  serious  since 
Woodward  and  Gray  [1993]  show  for  atmospheric  temperature  data  that  when  existing 
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methods  for  testing  b  =  0  in  equation  (1)  are  applied  to  the  data  one  concludes  that  b  is 
significantly  greater  than  zero  and,  hence,  warming  is  present.  On  the  other  hand,  they 
also  model  the  data  as  an  ARIMA(  p,  1,#),  which  is  a  plausible  correlation  model  ,  and  the 
inference  is  the  opposite,  i.e.,  the  current  warming  trend  is  temporary  and  should  not  be 
predicted  to  continue.  The  source  of  the  conflicting  results  can  be  traced  to  the  invalidity  of 
existing  tests  for  b-  0  in  this  setting. 

As  a  result  of  this  difficulty,  Woodward  and  Gray  stressed  the  importance  of  selecting  the 
statistical  model  used  to  represent  the  data  with  care.  Consequently,  they  developed  a 
method  to  let  the  data  select  the  model  [Woodward  and  Gray,  1995].  In  that  paper,  their 
procedure  selected  the  “correlation  model”  as  appropriate  for  the  atmospheric  temperature 
data,  which  implied  that  warming  should  not  be  predicted  to  continue.  This  method  was 
also  applied  to  simulated  ocean  acoustic  travel  time  data  in  our  previous  report,  where  it 
was  shown  that  it  would  take  well  over  20  years  to  reliably  distinguish  whether  such  data 
was  best  classified  as  coming  from  a  line  plus  noise  model  as  in  equation  (1)  or  a 
correlation  model  (ARIMA). 

What  is  needed,  however,  is  a  more  robust  test  for  trend  in  equation  (1)  than  currently 
exists.  That  is,  we  need  a  test  for  b  =  0  that  is  valid  for  much  higher  correlated  data  than 
previous  tests  allow.  By  developing  such  a  test  we  will  have  a  model  which  is  more  robust 
to  high  correlation  and,  hence,  should  be  more  compatible  with  correlation  models.  It  will 
not  be  possible  to  make  such  models  completely  compatible  since  the  assumption  that  the 
E,  in  equation  (1)  is  stationary  may  not  be  reasonable.  However,  it  is  possible  to 

dramatically  improve  the  test  for  b  =  0  so  that  it  is  compatible  with  correlation  models  in 
most  cases.  In  our  current  work,  described  in  this  report,  we  have  developed  a  new  test 
for  testing  b  =  0  in  equation  (1)  that  is  valid  for  correlations  as  high  as  .95.  That  is,  the 
new  test  appropriately  maintains  the  specified  significance  level  while  continuing  to  have 
good  detection  capability  when  a  trend  is  actually  present.  It  should  be  remarked  that 
although  equation  (1)  models  a  linear  trend,  the  test  is  sensitive  to  detecting  any  general 
increase,  or  decrease,  in  the  data.  It  is  interesting  to  note  that  when  this  new  test  is  applied 
to  the  atmospheric  temperature  data  we  get  agreement  with  the  previous  result  of 
Woodward  and  Gray,  that  is,  the  new  test  determines  that  b  is  not  significantly  different 
from  zero,  which  is  compatible  with  the  classification  of  the  data  as  more  likely  coming 
from  a  correlation  model  that  would  predict  no  warming.  This  new  method  is  described  in 
some  detail  in  section  2,  with  some  of  the  more  difficult  technical  discussion  found  in 
appendix  A,  which  is  a  copy  of  a  paper  on  this  topic  by  the  authors  submitted  for 
publication  in  the  Journal  of  Time  Series  Analysis. 


This  new  method  for  testing  for  trend  has  been  developed  in  such  a  way  that  it  can  be 
directly  extended  to  the  multivariate,  or  vector,  case.  In  the  ocean  acoustics  problem,  this 
corresponds  to  having  a  set  of  data  consisting  of  travel  times  on  several  paths  and  testing 
for  trend  on  all  of  the  paths  simultaneously.  This  generalization  will  be  presented  in  detail 
in  section  3. 

Appendix  B  contains  a  description  of  the  TRENDS  software,  which  allows  the  user  to 
perform  the  new  test  for  trend  on  a  selected  set  of  time  series  data.  It  also  contains  routines 
which  allow  the  data  to  select  between  the  model  of  equation  (1)  or  an  ARIMA(p,l,0) 
correlation  model.  A  given  set  of  data  can  be  modeled  in  various  ways,  the  noise  structure 
can  be  approximated,  and  the  power  of  the  test  for  trend  can  be  computed.  Questions  such 
as  how  long  will  it  take  to  detect  a  trend  in  data  similar  to  a  given  data  set  and  how  large 
would  the  trend  have  to  be  in  a  given  data  set  to  be  significant  can  be  answered  with  simple 
applications  of  the  software. 


2 .  TESTING  FOR  TREND  IN  UNIVARIATE  TIME  SERIES  DATA 


In  this  section  we  develop  a  new  method  to  test  the  hypothesis  Ho:  b  =  0  in  equation  (1) 
against  either  the  two-sided  alternative  Hj:  b  *■  0  or  the  one-sided  alternative  H] :  b  <  0  (or 
Hj :  b  >  0).  The  one-sided  alternative  b  <  0  is  more  appropriate  to  the  ocean  acoustics 
problem  since  the  travel  time  of  any  acoustic  signal  is  expected  to  decrease  with  time  on 
most  paths  if  there  is  warming.  We  will  concentrate  here,  however,  on  the  two-sided 
alternative,  b  *  0 ,  since  it  is  most  readily  generalizable  to  the  multivariate  case. 

In  equation  (1)  X,  is  assumed  to  be  a  random  variable  given  by  the  discrete  stochastic 
process  {Xt;t  =  0,±1,...}.  In  the  ocean  acoustics  problem  X,  represents  travel  time  as  a 
function  of  time,  t.  A  realization  of  length  n  of  the  time  series,  Xt,  is  a  set  of  real-valued 
outcomes  which  will  be  denoted  {xr;  t  =  1, Loosely  speaking,  a  set  of  data,  xt,  will 
be  considered  a  realization  from  the  time  series,  Xr  The  noise  component  in  equation  (1), 
Et,  will  be  assumed  to  be  a  stationary  autoregressive  (AR)  process  of  order  p  satisfying 


[Box  and  Jenkins,  1976;  Gray  et  al.,  1996] 

<P(B)Et  =  at,  (2) 

where  B  is  the  backward  shift  operator  given  by  BkXt  =  Xt_k, 

<p(B)  =  \-<pxB -<p2B2 - <PpBp ,  (3) 


2 

and  at  is  discrete  white  (Gaussian)  noise  with  zero  mean  and  variance  (Ja ,  i.e.,  E[at ]  =  0 
and  £[af2]  =  cr2.  To  motivate  the  new  method,  we  first  discuss  testing  for  trend  under  the 

assumption  that  the  noise  in  equation  (1)  is  white,  i.e.,  (j)(B )  =  1. 

2.1.  Ordinary  Linear  Regression 

Given  a  set  of  time  series  data,  {*,;  t  =  l,...,n},  we  wish  to  test  the  hypothesis  b  =  0  in 
equation  (1)  under  the  assumption  that  E,=ar  If  the  hypothesis  is  rejected  at  an 
appropriate  significance  level  (usually  5%),  then  it  is  generally  accepted  that  a  trend  is 
present.  In  ordinary  linear  regression  ,  the  least  squares  estimators  for  b  and  a  in 
equation  (1)  are 


(4) 


where 


_  ]^\ _ 

n  ’ 

2('-02 

f=l 

a  =  X -bt , 


1  n 

X  =  -X*„ 

nM 


1  A  n  + 1 

/=-E'  =  ^r- 


(5) 


(6) 

(7) 


Sample  estimates  of  these  quantities  are  obtained  by  substituting  the  sample  data  xt  into  the 


appropriate  equation.  Under  the  usual  regression  assumptions  that  the  residuals  are 
independent  and  normally  distributed  with  mean  zero  and  variance  <7 2,  the  estimated 


standard  error  of  b  is  given  by 


SE(b)  = 


12E(Xf  —a  —  bty 
r=i _ 

(n -  2)n(n2  - 1) 


1/2 


(8) 


Under  these  assumptions,  the  test  statistic  t  =  b  /  SE(b)  is  distributed  as  Student’s  t  with 
n  -  2  degrees  of  freedom  when  the  null  hypothesis  Ho:  b  =  0  is  true.  For  a  given 
realization,  ifl  can  then  be  compared  with  the  2.5%  critical  value  of  the  Student’s  t 
distribution  with  n  —  2  degrees  of  freedom,  for  the  two-sided  test.  The  null  hypothesis  is 
rejected,  and  the  trend  is  said  to  be  significant  (at  the  5%  level),  if  if l>  t  -  2),  which 
is  the  critical  region  for  the  test.  ( t915(n  -  2)  is  equal  to  2.01  for  n  =  50,  1 .98  for  n  =  100 

and  asymptotically  equal  to  1.96  for  large  n,  i.e.,  it  is  asymptotically  normal).  For  the 
one-sided  (negative)  test,  the  trend  will  be  significant  (at  the  5%  level)  if  t  <  -t95(n  -  2). 

A  thorough  discussion  of  linear  regression  theory  can  be  found  in  many  textbooks  on 
mathematical  statistics,  such  as  Robinson  and  Silva  [1979]. 
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2.2.  A  New  Testing  Procedure 


In  this  subsection  we  outline  the  new  method  to  test  Ho:  6  =  0  in  equation  (1).  Full  details 
can  be  found  in  appendix  A.  Notice  that  if  0(B)  in  (1)  were  known,  then  we  could 

rewrite  (1)  as 


0(B)X,  =  0(l)a  +  lift  6  +  0(l)6r  +  0(B)£, 

\i=l  J 

—  c  +  dt  +  at , 


(9) 


where  c  =  0(l)a  + 


p 

Vi=i  y 


6,  d  =  0(1)6  and  a,  is  white  noise. 


If  E,  is  stationary,  which 


implies  0(1)  >  0,  then  d  =  0  if  and  only  if  b  =  0,  and d  and  b  have  the  same  sign  when 
6*0.  To  test  6  =  0  in  equation  (1)  we  simply  test  d  =  0  in  equation  (9),  in  which  case 
we  are  able  to  use  the  usual  regression-based  standard  errors  as  given  in  section  3.1  since 
the  residuals  are  white. 


In  practice  0(B)  is  not  known  and  must  be  estimated  from  the  sample  data.  To  estimate 

* 

0(B),  we  subtract  the  least-squares  estimates  of  a  and  6  (denoted  a  and  6)  from  the  data 
to  obtain  the  residuals 

Et  =Xt  —  a  —  bt.  (10) 


These  residuals  do  not  follow  the  same  model  as  Et  and  in  general  are  not  stationary  since 


Ef  =  u  +  bt  +  Ef  —  ci  bt 
=  (a  —  d)  +  (b  —  b)t  +  Et , 


(ID 


which  does  not  have  constant  mean  unless  6  =  6.  However,  in  most  cases  we  find  it 

A 

reasonable  to  assume  these  residuals  are  approximately  AR(  p),  and  we  let  0(B)  denote  the 
estimated  autoregressive  operator.  We  transform  the  data  using  0(B)  to  obtain 


Wt  =  0(B)Xf 


=  0(l)fl- 


S«0/  6  +  0(l)6r  +  g, 


(12) 


where  c'  =  0(l)a  + 


(p  -A 

l"t>i 

v<= i  y 


5,  d'  =  0(l)5,  and  g,  =  0(5)0  l(B)at,  which  will  not  be 


white  noise  but  should  be  a  reasonably  close  approximation  to  it. 


A  straightforward  application  of  the  procedure  (assuming  g,  is  white)  is  to  use  standard 

regression  procedures  to  test  for  the  significance  of  d' ,  which  should  be  a  good  estimate  of 
d  if  gt  is  close  to  being  white.  This  estimation  procedure  is  summarized  as  follows: 


1 .  Estimate  a  and  5  using  least  squares. 

2.  Calculate  E,  as  in  (3). 

A  /V 

3.  Find  Burg  estimates  of  0(5)  where  0(5)5,  =  a,.  Call  this  estimate  0(5). 

4.  Transform  the  data  to  obtain  0(5)X,  =  c'  +  d't  +  g,  where  g,  is  nearly  white. 

A  ^  /V  A  A 

5 .  Calculate  t  =d'  /  SE(d') ,  where  d'  and  its  standard  error  are  the  usual  least 
squares-based  quantities  assuming  uncorrelated  residuals.  Compare  t  with 

A  * 

t(n-  p- 2)  critical  values  based  on  Student’s  t  since  0(5)X,  is  of  length 
n-p. 

A 

As  is  shown  in  appendix  A,  because  of  bias  in  the  estimate  0(5),  the  distribution  of  the 
test  statistic,  t ,  defined  above,  is  not  close  to  Student’s  t  distribution  when  the  residuals 
are  highly  correlated,  i.e.,  0(1)  =  0,  and  the  series  length  is  small  to  moderate,  n  ~  100. 
For  example,  for  the  model  in  equation  (1)  with  0(5)  =  1-.955  and  5  =  0,  this  test  had  a 
significance  level  of  approximately  25%  for  n  =  100  using  the  usual  critical  regions  based 
on  Student’s  t,  instead  of  the  nominal  5%.  We  see  that  this  procedure  suffers  from  the 
same  problem  of  excessive  actual  significance  levels  that  occurs  with  all  existing  tests. 


2.2.1.  A  bootstrap  approach 


This  deficiency  in  maintaining  the  true  significance  level  when  5  =  0  can  be  remedied  by 
finding  the  true  distribution  and  critical  regions  of  the  test  statistic  t ,  which  we  propose  to 
estimate  by  employing  a  bootstrap  procedure.  The  advantages  of  using  the  test  statistic  t 
defined  above  is  that  it  is  quickly  calculated  so  a  bootstrap  will  not  consume  a  prohibitive 
amount  of  computer  time  and  it  is  readily  generalizable  to  the  multivariate  case,  as  will  be 
seen  in  section  3.  In  the  bootstrap  procedure  we  obtain  t  for  a  sample  data  set  using  steps 
1-5  above.  To  simulate  realizations  under  Hq  we  estimate  0(5)  in  (1)  assuming  5  =  0  and 


denote  the  estimate  by  0(O)(5).  That  is,  under  H0  we  assume  that  any  trending  behavior  in 
the  series  is  due  to  correlation  structure  alone.  We  then  obtain  by  simulation  B  realizations 
from  the  autoregressive  model  with  AR  operator  given  by  0  ( B ).  For  the  bt h 
realization,  b  =  l,...,B,  we  calculate  t b  as  in  step  5.  For  the  two-sided  test,  the  null 
hypothesis  is  rejected  at  the  cc  level  of  significance  if  t  >  q_a/2  or  t  <  ta/ 2  where  tp  is  the 

fith  empirical  quantile  of  jf/J  J  .  Because  of  the  symmetric  nature  of  t ,  in  practice  we 
accomplish  this  test  by  rejecting  Ho  if  \t\>\t\x_a,  where  ld|_a  is  the  (1  —  «)th  empirical 
quantile  of  {if^l}^  Since  the  probability  that  a  randomly  selected  member  from  the 
population  is  greater  than  or  equal  to  the  jth  largest  value  is  j  /  (B  + 1),  then  by  setting 
a  =  j /(B+ 1)  it  follows  that  lfl*_a  is  the  y'th  largest  value  of  ,  e.g.,  if  a  =  0.05 

and  B  =  399  then  l/l*_a  is  the  20th  largest  value  of  [\K%=]  •  For  a  one-sided  test,  the  a- 
level  critical  value  is  the  (1  -  a  jth  or  ath  empirical  quantile  of  yb  J  depending  on 
whether  the  alternative  is  Hi :  b  >  0  or  Hj:  b  <  0,  respectively. 

2.2.2.  A  second  application  of  the  bootstrap 

As  is  shown  in  appendix  A,  if  0(  1)  is  near  zero,  that  is,  there  is  a  root  of  the  characteristic 
polynomial  close  to  unity,  the  significance  levels  are  still  high  after  the  bootstrap  has  been 
used  to  approximate  the  critical  region.  This  phenomenon  is  caused  by  the  bias  in 
estimating  <j)(B).  Appendix  A  describes  a  procedure,  which  relies  on  a  second  application 
of  the  bootstrap,  which  adjusts  the  test  statistic,  t ,  by  a  factor,  C,  which  is  less  than  one 
when  0(1)  is  near  zero,  to  yield  an  adjusted  test  statistic,  fa(j j  =  Ct .  This  adjusted  test 

statistic  is  then  compared  to  the  critical  values  of  the  distributions,  such  as 

described  above. 

2.3.  Results 

Section  3  of  appendix  A  shows  in  great  detail  the  results  of  simulation  studies  designed  to 
examine  the  performance  of  the  new  testing  procedures.  There  it  is  shown  (see  table  3  in 
appendix  A),  that  for  a  variety  of  noise  models  with  high  correlation,  the  observed 
significance  levels  are  near  the  nominal  5%  even  for  series  lengths  as  small  as  n  =  50.  It  is 
also  shown  there  that  the  new  tests  also  have  substantial  power  in  detecting  trends  in  the 
time  series  studied. 


2.3.1.  Analysis  of  time  series  data  from  the  MASIG  model 

As  an  example  of  the  use  of  the  new  testing  procedures,  we  analyze  a  time  series  produced 
by  the  MASIG  model.  The  MASIG  (Mesoscale  Air-Sea  Interaction  Group)  model  is  a 
reduced  gravity  ocean  model  driven  by  COADS  (Comprehensive  Ocean- Atmosphere  Data 
Set)  winds,  coupled  to  an  equatorial  model  at  its  southern  boundary  [Pares-Sierra  and 
O'Brien,  1989].  The  acoustic  travel-time  anomaly  for  a  path  from  Hawaii  to  San  Diego  for 
a  20  year  period  is  plotted  in  figure  2-1.  The  time  axis  is  given  in  years,  with  data  plotted 
every  month,  giving  240  points  in  the  time  series.  As  can  be  seen  in  the  figure,  the  travel¬ 
time  anomalies  are  between  ±  2  seconds.  This  time  series,  which  represents  the  model 
years  1970-1990,  has  no  trend  (the  slope  of  the  best  fit  straight  line  is  close  to  zero).  If 
there  were  warming  occurring  during  this  period  the  time  series  would  look  similar  to  that 
shown  in  figure  2-1,  except  that  there  would  be  added  to  it  a  warming  trend.  In  this 
context,  a  warming  trend  would  be  given  by  a  negative  slope  (the  best  fit  straight  line 
through  the  data  would  have  a  negative  slope).  A  slope  of  -0. 10  seconds/year,  a  change  of 
-2  seconds  over  20  years,  would  correspond  to  an  approximate  increase  in  temperature 
along  the  path  of  0.01  degrees  Celsius  per  year,  or  0.2°  C  increase  over  20  years  (a  slope 
of  -0.05  sec/yr  would  give  half  these  values). 


Figure  2-1.  Time  series  for  acoustic  travel-time  anomaly  from  the  MASIG 
model. 


We  modeled  the  last  160  points  of  the  MASIG  data  as  an  AR(10).  This  model  satisfies 


(l  -  2. 17652?  + 1. 6926B2  9443B3 +.  773054  6793 B5+.  4348  B6 
-.  0608 B7+.  23  69B8  4745 B9  +.  2024 B1 0  j  Et  =  at, 


(13) 


with  white  noise  variance  tr2  =  .001196.  Our  simulations  consisted  of  generating 
realizations  from  the  model  given  by  equation  (1)  with  Et  given  by  equation  (13)  with 
two  warming  scenarios:  lines  with  slopes  of  b  =  -0.05  and  b  =  -0.10  seconds/year.  In 
table  2-1  we  give  results  using  the  bootstrap  test  procedure  (labeled  TB  in  the  table)  and 
the  adjusted  bootstrap  test  procedure  (TBA)  assuming  5  to  30  years  of  monthly  data.  The 
table  also  contains  the  case  of  no  warming,  b  =  0 .  There  it  can  be  seen  that  when  b  =  0 
both  tests  produce  actual  significance  levels  which  do  not  differ  significantly  from  the 
nominal  5%  level,  with  the  only  exception  being  TB  at  5  years. 


Significance  Power 

Level _ -0.05  s/y  slope  -0.10  s/y  slope 


Years 

TB 

TBA 

TB 

TBA 

TB 

TBA 

5 

7.5 

4.3 

9.5 

5.3 

9.8 

4.9 

10 

6.0 

4.3 

10.5 

7.5 

14.4 

8.7 

15 

5.3 

5.1 

13.7 

11.6 

26.8 

22.7 

20 

4.6 

4.2 

21.7 

18.4 

45.8 

39.0 

25 

5.2 

5.0 

31.8 

28.6 

70.5 

62.0 

30 

6.0 

5.7 

49.5 

45.7 

88.0 

83.7 

Table  2-1.  Observed  significance  levels  and  powers  for 
AR(10)  model  fit  to  monthly  MASIG  data. 


The  power  in  table  2-1  is  computed  by  generating  1000  realizations  from  the  model  of 
equation  (1)  with  noise  from  equation  (13)  and  the  number  of  bootstraps  replications 
B  =  399.  The  power  is  estimated  by  the  percentage  of  realizations  which  have  significant 
slope  at  the  5%  nominal  level  using  the  one-sided  tests  describes  above.  As  can  be  seen  in 
the  table,  the  TB  test  has  more  power  than  the  TBA  test,  which  one  would  be  justified  in 
using  for  those  cases  where  the  significance  level  is  near  the  5%  nominal  level,  which  is 


usually  the  case  here  (except,  perhaps,  for  5  years).  Even  using  the  TB  test,  for  warming 
to  be  detected  at  least  50%  of  the  time  in  the  simulations,  over  20  years  of  data  would  be 
needed  if  the  slope  is  -0.10  seconds/year  and  at  least  30  years  if  the  slope  is  -0.05 
seconds/year. 


3 .  TESTING  FOR  TREND  IN  MULTIVARIATE  TIME  SERIES  DATA 


In  this  section  we  address  the  problem  of  testing  for  trend  in  the  multivariate,  or  vector, 
model 


Xf  =  a  +  br  +  E„  (14) 

where  the  vector  random  variable  E,  =  [En,Et2,...,E,m]'  may  be  highly  correlated  (in  time) 
stationary  noise.  In  the  ocean  acoustics  case,  Xt  =[Xti,Xt2,...,Xtm]' ,  where  Xti 

represents  the  travel  time  on  path  i,  i  =  (the  total  number  of  paths  is  m ),  at  time  t. 

A  sample  set  of  data  can  be  arranged  in  an  nxm  data  matrix,  X,  where  {X}„-  =  xti,  and 
xti  is  the  measured  travel  time  on  path  i,  i  =  l,...,m,  at  time  t,  t  =  l,...,n.  For  a  given 
data  set,  X ,  we  wish  to  test  the  hypothesis  Ho:  b  =  0  against  the  alternative  hypothesis 
Hj:  b  *  0 .  Other  alternative  hypotheses  will  not  be  dealt  with  here  for  two  reasons:  1)  it  is 
not  clear  in  the  ocean  acoustics  problem  what  the  alternative  hypothesis  should  be  since  it  is 
expected  under  the  global  warming  hypothesis  that  some  paths  will  warm  (decrease  in 
travel  time)  while  others  may  cool  (increase  in  travel  time)  and  2)  the  mathematics 
necessary  to  treat  other  alternative  hypotheses  is  quite  formidable  and  beyond  the  approach 
used  here. 

We  assume  that  the  noise  in  equation  (14),  Ef,  is  given  by  a  multivariate  autoregressive 
process  of  order  p  satisfying 

E,  =<D!E,_,  +<D2Ef_2+...+<I>pE,_p+U„  (15) 

where  Oj,  <t>2,...,  are  real  mxm  matrices  and  U,  is  a  multivariate  white  noise 
vector  such  that  E[U,]  =  0,  E[UfU£]  =  X,  and  £'[UrU^+/fe]  =  0,  k*  0.  To  motivate  the 
generalization  of  the  new  method  for  testing  for  trend  from  the  univariate  to  the  multivariate 
case,  we  first  discuss  testing  for  trend  in  the  multivariate  setting  under  the  assumption  that 
the  noise  in  equation  (14)  is  white,  i.e.,  E,  =  U,. 

3.1.  Ordinary  Multivariate  Linear  Regression 

Let  us  rewrite  equation  (14)  for  n  data  vectors  as 

X  =  HB  +  E,  (16) 


where  X  is  the  n  x  m  data  matrix  defined  above 


"*11 

x,2  • 

"  *,m 

• 

x  = 

X21 

*22  • 

-  *2m 

*n2  • 

y 

/vnm 

H,  the  “design  matrix”,  is  an  n  x  2  matrix  given  by 


B  is  a  2  x  m  matrix  defined  by 


and  the  error  matrix,  n  x  m,  is  given  by 

~En 

£21 
E=  21 


a\  a2 

•••  a, 

bx  b2 

-  b, 

E\2  •" 

Elm 

E22  •“ 

E2m 

En2 

E 

V nm 

The  assumption  in  ordinary  linear  regression  is  that  the  m  observations  at  time  t  have 

covariance  matrix  2,  but  observations  from  different  times  are  uncorrelated,  i.e., 
E[EtiEsj]  =  <5„2y,  ij  =  1  t,s  =  1 ,...,«,  where  is  the  Kronecker  delta. 

A 

The  least-squares  estimate  of  B,  denoted  B,  is  found  by  minimizing 
tr[(X  -  HB)'(X  -  HB)].  The  result  is  [Johnson  and  Wichem,  1988] 

B  =  (H'H)-1H'X.  (21) 

A  A 

It  can  be  shown  that  B  is  an  unbiased  estimator  for  B,  i.e.,  £[B]  =  B  and  that  the 

/V 

covariance  of  B  is  given  by  [Anderson,  1984] 

cov[B]  =  2  ®  (H'H)-1 ,  (22) 

1  _  |  A 

where  20  (H'H)-  is  the  direct  product  of  the  matrices  2  and  (H'H)  .  Focusing  on  b, 
the  second  row  of  B,  we  have 
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cov[b]  =  A  E, 


(23) 


where  A-1  is  the  22  element  of  (H'H)  !.  It  can  easily  be  shown  that 

r‘={(H'H)-'}  Jht-tf)  =  'i  ■■  (24) 

1  J22  l/=i  y  »(«  -0 

To  establish  distributional  results  the  further  assumption  is  made  that  the  noise  vector,  E,, 

A 

is  m  -variate  normal.  Under  this  assumption,  it  follows  that  the  estimator,  b,  which  is 
some  linear  combination  of  the  E;,  is  also  m  -variate  normal  with  mean  b  and  covariance 

given  by  equation  (24),  i.e.,  b  ~  N(b,  A  *X).  The  residual  matrix,  E,  is  defined  by 

E  =  X-HB  =  [l-  HCH'Hr^'Jx,  (25) 

and  the  residual  sum  of  squares,  E'E,  can  be  shown  to  be  independent  of  b  and  to  be 
distributed  as  a  Wishart  distribution  with  n  —  2  degrees  of  freedom,  i.e., 
E'E  ~  Wm(n  -  2,S)  [Johnson  and  Wichem,  1988],  An  unbiased  estimator  of  E  is  given 

by 

S  =  -^—  E'E  =  — t-  (X-HB)'(X-HB).  (26) 

n- 2  n—2 

We  now  define  the  test  statistic,  T 2 ,  by 

T2=Mn~  2)(b  -  b)'(E'E)-1  (b  -  b)  =  A(b  -  b/S'1  (b  -  b).  (27) 

It  can  be  proved  [Seber,  1984]  that  in  our  case,  i.e.,  b  ~  N( b,A~  E),  E'E  ~  Wm(n- 2,E) 
and  b  independent  of  E'E,  then 

n-2-m  +  l  r2_  ^  F(m  n _2~m  +  l\  (28) 

m  n- 2 

where  F(m,n-2-m  +  \)  is  the  F -distribution  with  m  and  n-2-m  +  l  degrees  of 
freedom. 

To  test  the  hypothesis  Ho:  b  =  0  against  the  alternative  hypothesis  Hi:  b  *  0,  we  assume 
b  =  0  in  equation  (27)  and  compare 


with  the  critical  value  at  the  5%  significance  level  of  the  F -distribution, 
F95(m,n -2- m+l).  The  null  hypothesis  is  rejected  and  the  trend  is  said  to  be  significant 

(at  the  5%  level)  if  F  >  F95(m,n  -2-  m  +  l). 


3.2.  Extending  the  New  Testing  Procedure  to  the  Multivariate  Case 

In  this  subsection  we  generalize  to  new  testing  procedure  introduced  in  section  2.2  to  test 
the  hypothesis  Ho:  b  =  0  in  the  vector  line  plus  noise  model  of  equation  (14).  The  noise 
in  equation  (14)  is  given  by  the  AR (p)  process  satisfying  equation  (15)  which  we  rewrite 

W)E,=U„  (30) 


where  the  mxm  matrix  operator  <E >(B)  is  given  by 

0(5)  =  I  -  <Dj  B  -  0252 ~ O „BP , 


(31) 


with  I  the  mxm  identity  matrix.  If  0(5)  were  known,  equation  (14)  could  be  rewritten 

(32) 


p  \ 

0(5)X,  =  0(l)a  +  1*0,  b  +  0(l)bt  +  0(5)E, 
V;'= 1  J 


—  c  +  df  +  Uf, 


where  c  =  4>(l)a+ 


b,  d  =  <b(l)b,  and  U,  is  vector  white  noise.  If  E,  is  a 

V«=i  J 

stationary  operator,  which  implies  det(0(l))  >  0,  then  d  =  0  if  and  only  if  b  =  0.  To  test 
b  =  0  in  equation  (14)  we  test  d  =  0  in  equation  (32)  using  the  ordinary  multivariate 
linear  regression-based  test  described  in  section  3.1. 

As  in  the  univariate  case,  0(5)  is  not  known  and  must  be  estimated  from  the  sample  data. 


■*.  r  ^  1  ^ 

After  computing  the  least-squares  estimates,  B  =  a'  ;  b' 
the  residuals 


,  as  in  equation  (21)  compute 


E,  =Xf-a-b  t. 


(33) 


These  residuals  do  not  follow  the  same  model  as  E,  and  in  general  are  not  stationary  since 


Ef  —  a  +  bt  +  Ej  -  a  —  hr 
=  (a  -  a)  +  (b  -  b)t  +  E, , 


(34) 
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which  does  not  have  constant  mean  unless  b  =  b.  However,  in  most  cases  we  find  it 
reasonable  to  assume  these  residuals  are  approximately  multivariate  AR(/?),  and  we  let 
cp( B)  denote  the  estimated  autoregressive  operator.  We  transform  the  data  using  0(5)  to 

obtain 


W,  =  0(5)X, 


=  c'  +  dT  +  g,, 


(35) 


where  c' =  0(l)a  + 


p  *  \ 

I/O, 

v«=l  J 


b,  d'  =  0(l)b,  and  g,  =0(5)0  ‘(5)U,,  which  will  not  be 


white  noise  but  should  be  a  reasonably  close  approximation  to  it. 


We  proceed  as  in  the  univariate  case  (assuming  g,  is  white)  and  begin  to  use  standard 

regression  procedures  to  test  for  the  significance  of  d  ,  which  should  be  a  good  estimate  of 
d  if  g,  is  close  to  being  white.  This  estimation  procedure  is  summarized  as  follows: 


1 .  Estimate  a  and  b  using  least  squares. 

2.  Calculate  E,  as  in  (33). 

3.  Find  estimates  of  0(5)  where  0(5)E,  =  U, .  Call  this  estimate  0(5). 

4.  Transform  the  data  to  obtain  W,  =  0(5)X,  =  c'  +  d'r  +  g,  where  g,  is  nearly 
white. 


5.  Calculate  F  as  in  section  3.1  using  the  vector  series  W,  and  assuming 
uncorrelated  residuals.  Compare  F  with  F95  (wi,  w  —  p  —  2  —  m  +  l),  the  critical 
value  based  on  F -distribution,  since  0(5)X,  is  of  length  n  —  p. 


As  in  the  univariate  case,  the  distribution  of  the  test  statistic,  F ,  defined  in  step  5,  is  not 
close  to  an  F-distribution,  yielding  excessive  significance  levels  when  b  =  0,  when  the 
residuals  are  highly  correlated  and  the  series  length  is  small  to  moderate.  The  bootstrap 
approach  used  in  the  univariate  case  may  be  used  in  the  same  way  to  lower  the  significance 
levels. 
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3.2.1.  A  bootstrap  approach 


To  estimate  the  actual  distribution  and  critical  regions  of  the  test  statistic  F  when  b  =  0  in 
model  (14),  we  employ  a  similar  bootstrap  procedure  as  in  the  univariate  case.  In  the 
bootstrap  procedure  we  obtain  F  for  a  sample  data  set  using  steps  1-5  above.  To  simulate 
realizations  under  Ho  we  estimate  in  (14)  assuming  b  =  0  and  denote  this  estimate 
by  O(0)(5).  That  is,  under  Ho  we  assume  that  any  trending  behavior  in  the  series  is  due  to 
correlation  structure  alone.  We  then  obtain  by  simulation  B  realizations  from  the 
autoregressive  model  with  AR  operator  given  by  <J>(0)(5).  For  the  Mi  realization, 
b  =  we  calculate  F*b  as  in  step  5.  The  null  hypothesis  is  rejected  at  the  a  level  of 

a  ^  ^  r  ^  1  ^ 

significance  if  F  >  F]_a  where  Fp  is  the  /Jth  empirical  quantile  of  | F^j  .  Since  the 


probability  that  a  randomly  selected  member  from  the  population  is  greater  than  or  equal  to 
the  yth  largest  value  is  j  /  (B  + 1),  then  by  setting  a  =  j  /  (B  + 1)  it  follows  that  F*_a  is 

the  jth  largest  value  of  ,  e.g.,  if  a  =  0.05  and  B  -  399  then  F*_a  is  the  20th 

l  J  b=\  ^ 


largest  value  of 


i399 

lb=l' 


This  multivariate  testing  procedure  is  basically  equivalent  to  the  univariate  procedure  when 
m  =  1,  because  the  distribution  of  t 2  with  n-  2  degrees  of  freedom  in  the  univariate  case 
with  white  residuals  is  distributed  as  F(l,n  -  2),  which  is  the  same  as  equation  (28)  with 
m  =  1.  As  in  the  univariate  case,  when  the  correlation  (in  time)  in  the  noise  is  high,  i.e., 
det(<I>(l))  is  near  zero,  the  significance  levels  are  still  high  after  the  bootstrap  has  been 
used  to  approximate  the  critical  region.  The  actual  significance  levels  determined  by  the  test 
increase  as  m  increases  for  cases  where  there  is  no  correlation  between  paths  (components 
of  X, ).  It  is  reasonable  to  believe  that  an  adjustment  procedure  similar  to  that  described  in 

section  2.2.2  could  be  used  to  help  correct  the  inflated  significance  levels.  A  difficulty 
arises  in  choosing  the  “median”  model  to  be  employed  in  the  second  application  of  the 
bootstrap,  as  described  in  appendix  A.  This  difficult  issue  must  be  left  for  the  future. 
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APPENDIX  A 


IMPROVED  TESTS  FOR  TREND  IN  TIME  SERIES  DATA 

This  appendix  contains  a  paper  by  Wayne  A.  Woodward,  Steven  Bottone,  and  H.  L.  Gray 
entitled  “Improved  Tests  for  Trend  in  Time  Series  Data”,  which  has  been  submitted  for 
publication  to  the  Journal  of  Time  Series  Analysis. 


IMPROVED  TESTS  FOR  TREND  IN  TIME  SERIES  DATA 

Woodward,  Wayne  A.*,  Bottone,  Steven**,  and  Gray,  H.L.* 

*  Southern  Methodist  University 
**  Mission  Research  Corporation 

ABSTRACT 

The  difficult  problem  of  testing  for  linear  trend  in  the  presence  of  correlated  residuals  is 
addressed.  Because  of  the  correlated  residuals,  tests  for  trend  based  on  the  classical  least- 
squares  regression  techniques  are  inappropriate.  Even  procedures  in  the  literature  that 
adjust  for  the  correlation  in  the  residuals  tend  to  have  the  problem  that  the  observed 
significance  levels  are  higher  than  nominal  levels  for  small  to  moderate  realization  lengths 
whenever  the  residuals  are  highly  correlated.  We  introduce  a  bootstrap-based  procedure 
to  test  for  trend  in  this  setting  which  is  better  adapted  to  controlling  the  significance  levels, 
and  this  testing  procedure  is  studied  via  simulation  results.  The  procedure  is  then  applied 
to  the  problem  of  testing  for  trend  in  global  atmospheric  temperature  data  and  in  data 
from  models  for  ocean  acoustic  travel  time  along  a  path. 


Keywords:  regression  with  correlated  residuals,  bootstrap,  global  warming,  hypothesis 
testing 
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1.  INTRODUCTION 


The  challenging  problem  of  testing  for  trend  in  time  series  data  is  one  which  arises 
in  many  areas  of  application.  This  problem  has  been  addressed  by  many  authors  including 
recent  discussions  by  Brillinger  (1989,  1994),  Woodward  and  Gray  (1993,  1995), 
Bloomfield  and  Nychka  (1992),  Bloomfield  (1992),  and  Harvey  (1989). 

The  specific  problem  we  address  in  this  paper  is  that  of  testing  for  trend  using  the 

model 


Yt=a  +  bt  +  Zt  (1) 

where  Zt  is  a  stationary  autoregressive  process  of  order  p  satisfying  4>{B)Zt  =  at,  where 

at  is  white  noise  and  4>{B)  =  \  —  <f>\B - -  <f>pBp,  where  B  is  the  backshift  operator 

defined  by  BkYt  =  Yt-k-  In  Figure  A-l  we  show  the  global  temperature  series  for  the 
years  1880-1987  as  obtained  by  Hansen  and  LebedefF  (1987,  1988).  It  is  clear  that 
temperatures  have  had  a  tendency  to  rise  over  this  time  span.  When  viewing  this  series,  it 
is  difficult  to  ascertain  whether  the  trend  in  the  series  is  due  to  some  deterministic 
component  such  as  the  bt  term  in  (1)  or  is  simply  due  to  wandering  or  random  trending 
behavior  caused  by  roots  of  4>{r)  =  0  near  (or  equal  to)  one.  Given  a  model  such  as  (1) 
and  a  series  such  as  that  given  in  Figure  A-l,  it  is  clear  that  estimation  procedures  would 
be  expected  to  have  difficulty  "knowing"  whether  to  attribute  apparent  trends  to 
deterministic  components  (i.e.  nonzero  b)  or  to  high  autocorrelation  in  Zt  (i.e.,  roots  of 
4>(B)  =  0  near  one).  When  roots  of  <p(r)  =  0  are  near  unity  the  estimation  procedures 
currently  in  use  tend  to  be  biased  in  favor  of  attributing  random  trending  behavior  to  the 
existence  of  a  nonzero  slope.  Specifically,  when  (f>{r)  =  0  has  a  root  or  roots  near  unity, 
tests  for  trend  assuming  model  (1)  have  a  tendency  to  inflate  the  significance  level  over 


nominal  amounts,  i.e.,  to  attribute  the  trend  behavior  to  the  term  bt.  Park  and  Mitchell 
(1980)  studied  the  case  in  which  the  autoregressive  order  p  =  1  and  used  simulations  to 
examine  a  number  of  estimation  procedures  including  a  maximum  likelihood  method  due 
to  Beach  and  MacKinnon  (1978).  They  concluded  that  all  the  test  procedures  studied  had 
a  tendency  for  the  slope  to  appear  more  significant  than  it  really  is  when  <fr(r)  =  0  has  a 
root  near  one,  i.e.,  when  </>iis  near  one.  The  SAS  AUTOREG  procedure  uses  the  Beach 
and  MacKinnon  ML  estimates  and  quotes  p-values  associated  with  the  test  Hq  :  6  =  0. 
These  p-values  are  based  on  the  assumption  that  the  test  statistic  calculated  is  distributed 
as  t  with  n  -  2  degrees  of  freedom.  Park  and  Mitchell  (1980)  indicate  that  these  types  of 
results  can  be  very  misleading  and  that  when  4>(r)  =  Ohas  roots  near  one,  the  user  should 
consider  using  lower  nominal  significance  levels  to  adjust  for  the  fact  that  actual 
significance  levels  are  higher  than  the  nominal  levels.  A  discussion  of  these  problems  is 
also  given  in  the  SAS/ETS  User's  Manual  (1993).  Woodward  and  Gray  (1993)  found  that 
in  simulated  realizations  of  length  n  =  100  from  (1)  where  <fi(B )  =  1  —  .95 B  and  6  =  0, 
a  significant  trend  is  found  about  35%  of  the  time  using  the  Bloomfield  and  Nychka 
(1993)  test  (using  a  2-sided  test  at  the  nominal  5%  level)  which  adjusts  the  standard  error 
of  the  least  squares  estimator  of  6  to  account  for  the  correlation  structure.  Additionally, 
Brillinger's  (1989)  test  for  a  monotonic  trend  in  a  time  series  found  a  significant  trend  in 
about  50%  of  the  realizations  in  this  case.  Use  of  SAS  to  obtain  ML  estimates  in  this 
same  setting  resulted  in  significant  results  about  25%  of  the  time. 

Based  on  the  results  discussed  above,  we  see  that  the  existing  procedures  can  have 
such  high  true  significance  levels  that  the  finding  of  a  significant  trend  using  these  results 
must  be  viewed  with  extreme  caution.  One  might  even  conclude  that  it  is  often  simply  not 
possible  to  examine  a  series  of  short  to  moderate  length  and  make  an  intelligent  decision 
concerning  whether  an  apparent  trend  is  deterministic  or  random.  In  this  paper  we  help 
make  such  a  decision  more  plausible  by  introducing  a  new  procedure  for  testing  the 
hypothesis  Ho  :  6  =  0  in  (1)  which  is  more  effective  in  controlling  the  actual  significance 


level  even  in  the  presence  of  highly  autocorrelated  errors.  In  Section  2  we  briefly  discuss 
tests  for  trend  and  introduce  our  new  procedure.  In  Section  3  we  discuss  simulation 
studies  that  demonstrate  the  fact  that  the  new  procedure  does  a  much  better  job  of 
maintaining  actual  significance  levels  near  nominal  levels.  Finally,  in  Section  4  we  apply 
the  test  proposed  here  to  some  actual  data. 

2.  A  NEW  TESTING  PROCEDURE 

In  this  section  we  specifically  address  the  problem  of  testing  Hq  :  b  =  0  in  (1) 
against  one  and  two-sided  alternatives.  Some  techniques  proposed  in  the  literature  (e.g. 
Bloomfield  and  Nychka,  1992)  estimate  b  using  usual  least  squares  estimators  and  adjust 
the  standard  error  of  the  least  squares  estimator  to  account  for  the  correlation  structure. 
Others  such  as  the  Beach  and  MacKinnon  (1978)  ML  technique  involve  iterative 
procedures  for  simultaneously  estimating  a,  b ,  and  the  coefficients  in  0(5).  The 
technique  we  propose  uses  the  usual  least  squares  estimates  of  a  and  b.  Notice  that  if 
4>{B)  in  (1)  were  known,  then  we  could  rewrite  (1)  as  follows: 

4>(B)Yt  =  0(1 )a  +  (!>&)  b  +  <KW  +  4>{B)Zt 

=  c  +  dt  +  at  (2) 

where  c  =  0(1)  a  +  an dd  —  0(1)  b  and  where  at  is  white  noise. 

Note  that  if  Zt  is  stationary,  which  implies  0(1)  >  0,  then  d  =  0  if  and  only  if  b  =  0,  and 
d  and  b  have  the  same  sign  when  6^0.  To  test  b  —  0  in  (1)  we  simply  test  d  =  0  in  (2) 
in  which  case  we  are  able  to  use  the  usual  regression-based  standard  errors  since  the 
residuals  are  white. 


In  practice  <f>(B)  is  not  known  so,  for  now,  we  consider  the  following  approach. 

/*S  & 

The  least  squares  estimates  of  a  and  b  (denoted  a  and  b )  are  obtained  and  the  residuals 
from  the  regression  line  are  calculated  using 

Zt  =  Yt  -a-bt. 


These  residuals  do  not  follow  the  same  model  as  Zt  and  are  in  general  not  stationary  since 

Zt  =  a  +  bt  -f*  Zt  —  u  —  bt 
=  (a  —  a)  +  (b  —  b)t  +  Zt 

which  does  not  have  constant  mean  unless  b  =  b.  However,  in  most  cases  we  find  it 
reasonable  to  assume  that  Zt  is  approximately  AR (p),  and  we  let  4>{B)  denote  the 
estimated  autoregressive  operator.  We  transform  the  data  using  to  obtain 

Wt  =  4>(B)Yt 

—  0(l)u  +  +  <f>(l)bt  +  gt 

=  c'  +  d't  +  gt  (4) 

where  c  =  J(l)a  +  »  2  =  ?(l)6,and  wherepf  =4>(B)4>~l(B)at  which 

will  not  be  white  noise  but  should  be  a  reasonably  close  approximation  to  it. 

A  straightforward  application  of  the  procedure  (assuming  gt  is  white)  is  to  use 
standard  regression  procedures  to  test  for  the  significance  of  d.  This  estimation  procedure 
is  summarized  in  the  following: 

(i)  Estimate  a  and  b  using  least  squares. 

/s 

(ii)  Calculate  Zt  as  in  (3). 


A-6 


(iii)  Find  Burg  estimates  of  4>{B)  where  < p(B)Zt  =  at  (see  Burg,  1967  and 
Marple,  1987).  Call  this  estimate  <f>{B). 

(iv)  Transform  the  data  to  obtain  4>{B)Yt  —  c'  +  d't  +  gt  (5) 

where  gt  is  nearly  white  noise. 

/V  /N  -'S  /S 

(v)  Calculate  t  =  d/SE(d)  where  d  and  its  standard  error  are  the  usual 
least  squares-based  quantities  assuming  uncorrelated  residuals.  Compare 

/N  /S 

t  with  t(n  —  p  —  2)  critical  values  based  on  Student's  t  since  <f>(B)Yt  is  of 
length  n  —  p. 

We  considered  a  2-sided  version  of  this  procedure  on  simulated  realizations  of  length  100 
from  the  model  in  (1)  with  <f>(B)  =  1  —  .95B  and  6  =  0.  We  found  that  a  significant 
trend  is  typically  found  over  25%  of  the  time  using  usual  critical  regions  based  on 
Student's  t,  and  so  we  see  that  this  procedure  suffers  from  the  same  problem  of  excessive 
actual  significance  levels  which  occurred  with  the  previously  mentioned  tests. 

It  seems  that  the  primary  reason  for  the  excessive  significance  levels  in  this  case  is 
that  when  <p\  in  1  —  0i  B  is  close  to  1,  4>x  tends  to  be  less  than  0i  so  that  1  —  0t  tends  to 
be  larger  than  1  -  <p\.  This  happens  in  general,  i.e.,  when  4>(B)  has  a  factor  close  to 
1  -  B  then  0(1)  tends  to  be  larger  than  0(1).  In  the  AR(1)  case,  for  example,  it  is  well 
known  that  usual  estimators  (i.e.,OLS,  ML  and  Burg)  for  0i  exhibit  a  bias  toward  zero, 
i.e.,  away  from  the  nonstationary  region  (see  Kang,  1992).  This  bias  is  demonstrated  in 
Table  A-l  where  we  show  the  average  of  the  Burg  estimates  of  0iover  250  replications 
from  a  variety  of  realization  lengths  and  values  of  0i .  For  each  configuration  we  show  the 
average  of  <f>\  estimates  before  and  after  removing  the  least  squares  line.  The  bias  that  has 
been  addressed  by  Kang  is  that  found  before  the  line  is  removed.  There  it  can  be  seen  that 
for  n  =  50  and  n  =  100  this  bias  can  be  substantial.  However,  it  should  also  be  noticed 
that  after  removal  of  the  line,  the  estimates  are  even  more  biased  away  from  the 
nonstationary  region.  This  is  not  surprising  since  the  effect  of  removing  the  line  will  be  to 


use  the  line  to  account  for  some  of  the  behavior  associated  with  the  correlation  structure 
in  the  original  series.  Thus,  in  the  n  =  100  case  with  <f>  =  0.95,  we  see  that 
J(l)  =  1  -  «  0.124  which  is  substantially  larger  than  =  0.05.  Our  simulations 

show  that  transforming  the  series  as  in  (2)  with  the  true  value  of  and  testing 

Ho  :  d  =  0  using  standard  regression-based  test  statistics,  leads  to  a  test  that  appears  to 
have  the  appropriate  significance  level,  while  as  we  have  seen,  transforming  by  <f>(B) 
instead  of  <f>(B),  which  we  of  course  must  do  in  practice,  leads  to  a  test  with  inflated 
significance  levels. 

Examination  of  the  calculated  t  values  using  this  procedure  indicates  that  the 
variability  of  t  is  substantially  larger  than  that  which  would  be  expected  based  on  the 
Student's  t  with  n-p-2  degrees  of  freedom.  In  Table  A-2  we  show  empirical  97.5th 
percentiles  obtained  from  100  simulated  realizations  from  the  model  in  (1)  with  6  =  0  and 
p  =  l  for  several  values  of  n  and  <f>\ .  In  the  table  we  see  that  for  <f>\  near  unity,  these 
percentiles  are  substantially  larger  than  the  percentiles  of  a  Student's  t  with  n  —  p  —  2 
degrees  of  freedom  which  in  the  cases  considered  here  are  slightly  larger  than  1.96.  Thus, 
in  these  cases,  the  distribution  of  t  is  symmetric  about  zero  but  does  not  follow  a  Student's 

XV 

t  with  n-p-2  degrees  of  freedom.  It  is  clear  that  for  a  given  n,  the  variability  of  t 
increases  as  the  true  value  of  <f>\  approaches  +1 . 

(a)  A  bootstrap  approach 

In  order  to  estimate  the  actual  distribution  of  t  when  Yt  follows  the  model  in  (1) 

XV 

with  6  =  0,  we  propose  a  bootstrap  procedure.  In  this  procedure  we  obtain  t  as 
described  previously  in  (5).  To  simulate  realizations  under  Ho  we  estimate  4>{B)  in  (1) 
assuming  6  =  0.  We  denote  this  estimate  by  J(0)  (B).  That  is,  under  Ho  we  assume  that 
any  trending  behavior  in  the  series  is  due  to  the  correlation  structure  alone.  We  then 
obtain  B  realizations  from  the  autoregressive  model  with  AR  operator  given  by  (B). 
For  the  6th  realization,  6  =  1,  ... ,  B  we  calculate?^  as  in  (v).  It  can  be  shown  that  t\ 


does  not  depend  on  the  white  noise  variance,  which  thus  can  be  chosen  arbitrarily  for  the 

bootstrap  replications.  For  the  2-sided  test,  the  null  hypothesis  is  rejected  at  the  a  level  of 
significance  if  t  >t1J¥a/ 2  or  t  <ta*2  where  is  the  /3th  empirical  quantile  of 

{t*b}  .  Because  of  the  symmetric  nature  of  t,  in  practice  we  accomplish  this  test  by 

rejecting  Ho  if  |?|  >  \t\  t  *Q  where  \t\ t  *a  is  the  (1  -  a)th  empirical  quantile  of  { \t  *b  |}  6f x . 
Since  the  probability  a  randomly  selected  member  from  the  population  is  greater  than  or 
equal  to  the  jth  largest  value  is  j/(B  +  1),  then  by  setting  a  =  j/(B  + 1)  it  follows  that 
\t\  1*Q  is  the  jth  largest  value  of  {|t  *  e>  ^ a  =  0.05  and  B  =  399  then  \t\  1*Q  is 

the  20th  largest  values  of  {|<  .  For  a  1 -sided  test,  the  a-level  critical  value  is  the 

✓s  n 

(1  —  a)th  or  ath  empirical  quantile  of  {t*b}b=l  depending  on  whether  the  alternative  is 
Hi  :b  >  0  or  Hi  :b  <  0  respectively. 

(b)  A  second  application  of  the  bootstrap 

It  should  be  noted  that  for  linear  one,  the  observed  significance  levels  are  high. 
This  phenomenon  is  caused  by  the  bias  shown  in  Table  A-l.  That  is,  if  t  is  calculated 
from  data  associated  with  <j>\  =  0.99,  b  =  0,  and  n  =  100,  the  estimate  is  likely  to  be 
about  0.95.  Thus,  the  bootstrap  realizations  are  generated  from  the  AR(1)  model 
(1  —  =  at  with  &  0.95,  and  this  will  result  in t*b  values  that  are  not  as 

variable  as  those  for  the  original  model  with  <p\  =  0.99  as  can  be  seen  in  Table  A-2.  An 
intuitively  appealing  procedure  would  be  to  scale  the  original  t  so  that  it  has  variance 
comparable  to  that  of  the  bootstrap  distribution  for  t  b,  i.e.,  we  obtain  t^j  —  Ct  where 
C  =  a /< 7-j-  and  where  C  would  be  less  than  one  when  4>\  is  near  one.  Clearly,  o^.  can 

be  estimated  from  tb,b  =  l,...,B.  However,  no  comparable  estimate  of  is  available. 
It  is  clear  from  Table  A-l  that  the  estimates  of  the  autoregressive  coefficient  of  the 
bootstrap  realizations  from  (1  —  =  at  will  in  general  underestimate  4> ^ in  much 

the  same  way  that  tends  to  underestimate  4>\.  Let  4>  »  b  =  1,  ... ,  B denote  the 

coefficient  estimates  of  4>  ^  from  the  B  bootstrap  realizations,  and  let  0  *  (m)  denote  the 


median  of  the  coefficient  estimates  obtained  from  these  realizations.  We  generate  a  second 
set  of  bootstrap  realizations  from  the  AR(1)  model  with  coefficient  0 ,  (m),  and  we  denote 

the  t  values  calculated  from  this  second  set  of  bootstrap  realizations  as  t  *b ,  b  =  1,  ... ,  B. 
Since  a?.  /cry  ^  ax,  /  a?,  we  calculate  t^j  =  C  t  where  C  =  ax,  /ax.  and  the  a 's  are 

lb '  1  tb'  lb  J  tb  b 

/N 

✓N  /S 

the  sample  standard  deviations  from  the  B  values  of  t  *b  and  t  * . 

The  concept  of  the  "median"  model  is  not  quite  so  clear-cut  when  an  AR(p)  model 
with  p  >  1  is  used.  Simply  selecting  a  model  whose  coefficients  are  the  corresponding 
median  values  0*(ra),  j  =  1,  ... ,  p  may  result  in  an  AR(p)  model  that  does  not  have  the 

desired  properties  and  may  even  have  roots  inside  the  unit  circle.  Since  the  phenomenon 
of  excessive  significance  levels  is  caused  by  roots  of  0(r)  =  0  close  to  one,  we  take  0(1) 
to  be  our  measure  of  the  extent  to  which  a  fitted  model  has  a  root  near  +1.  In  the  general 
AR(p)  case,  the  "median"  model  is  then  selected  as  the  model  from  the  first  set  of 

^  n 

bootstrap  realizations  associated  with  the  median  value  of  {0^(1)}&=1  where 
0^(1)  =  1  —  <f>x*bj  — - ^p(6)-  This  procedure  is  equivalent  to  the  approach 

described  in  the  preceding  paragraph  when  p  =  1. 

3.  SIMULATION  STUDIES 

In  this  section  we  discuss  the  results  of  simulation  studies  designed  to  examine  the 
performance  of  the  testing  procedures  discussed  in  the  previous  sections.  In  Table  A-3  we 
show  the  observed  significance  levels  from  testing  Ho  :  b  =  0  vs.  Hi  :  b  ^  0  in  (1)  with 
p  =  1  based  on  simulated  realizations  for  a  variety  of  values  of  n  and  0i .  The  testing 
procedures  used  were  the  Beach  and  MacKinnon  (MLE)  (1978)  procedure  using  SAS,  the 
Bloomfield  and  Nychka  (BN)  (1992)  procedure,  the  transformation  (T)  procedure 
discussed  in  (5)  using  Student's  f-based  critical  values,  the  bootstrapped  version  (TB)  of 
the  transformation  procedure,  and  the  "adjusted"  bootstrap  approach  (TBA)  using  the 
second  bootstrap  application.  The  tabled  values  are  the  percentage  of  simulated 
realizations  for  which  a  trend  was  detected.  For  MLE,  BN  and  T  the  results  shown  are 


based  on  250  replications  while  those  for  TB  and  TBA  are  based  on  1000  replications.  In 
the  table  it  can  be  seen  that  for  small  to  moderate  realization  lengths,  the  observed 
significance  levels  for  MLE,  BN,  and  T  is  substantially  higher  than  the  nominal  5%  level 
especially  when  4>i  is  near  +1 .  Only  for  n  >  500  with  <fi\  =  0.8  did  the  actual  observed 
significance  levels  attain  the  nominal  levels.  For  4>\  =  0.95  it  seems  that  these  levels  are 
approaching  5%  as  n  increases,  but  the  5%  level  was  not  attained  by  n  =  1000.  Thus, 
based  on  the  evidence  presented  here  it  is  seen  that  the  MLE,  BN,  and  T  techniques  do 
not  behave  well  for  highly  correlated  residuals  and  small  to  moderate  realization  lengths. 
However,  even  in  the  case  of  highly  correlated  residuals  these  tests  appear  to  behave 
properly  asymptotically. 

Results  for  TB  and  TBA  are  given  for  <f>\  =  0.8,  0.95,  0.99  and  —  0.95.  The 
significance  levels  for  TB  are  much  closer  to  the  nominal  levels,  being  somewhat  too  large 
for  n  <  100  when  4>\  =  0.95  and  for  n  <  500  when  <f>\  =  0.99.  The  significance  levels  of 
6.4%  in  the  table  are  borderline  significantly  too  high  (about  2  SE's  above  5%).  As 
expected,  the  use  of  TBA  improved  the  significance  levels  and  only  in  the  cases  of  n  =  50 
and  500  with  <f>\  =  0.99  were  the  observed  significance  levels  significantly  larger  than  5%. 
It  seems  that  as  <p\  approaches  —  1 ,  no  corresponding  significance  level  problems  arise. 
This  is  reasonable  since  the  inflated  significance  levels  are  attributable  to  apparent  trends 
in  Zt  due  to  roots  near  +1.  It  is  clear  from  the  table  that  for  values  of  </>iwell  removed 
from  +1  and  for  larger  realization  lengths,  significance  levels  for  TB  are  acceptable  and  the 
adjustment  does  not  have  a  substantial  effect. 

In  Table  A-4  we  show  power  results  for  a  variety  of  values  of  <p\  and  slopes  when 
n  =  100  and  p  =  1.  In  order  to  standardize  the  variance  of  the  residual  series,  Zt>  in  all 
cases  the  white  noise  variance  of  the  Zt  series  is  selected  so  that  cr2z  =  a2/(  1  —  )  =  1. 

It  can  be  seen  that  in  the  cases  considered  these  tests  do  have  substantial  power,  especially 
for  lower  values  of  <j)\ .  In  fact  the  trend  is  detected  over  90%  of  the  time  with  TB  and  at 
least  82%  of  the  time  with  TBA  for  <fii  <  0.8  for  the  slopes  considered.  The  use  of  TBA, 


which  does  a  better  job  of  controlling  the  significance  level,  causes  a  dramatic  reduction  in 
power  when  <f>  \  is  near  one  and  the  realization  lengths  are  not  large.  It  is  not  surprising 
that  smaller  power  is  encountered  for  small  to  moderate  slopes  in  the  case  in  which  <f>\  is 
near  +1  due  to  the  confusion  between  random  and  deterministic  trends.  However,  if  slight 
inflation  of  significance  level  (say  to  about  10%)  can  be  tolerated,  then  one  might  consider 
the  use  of  TB  because  of  its  substantially  higher  power  than  TBA  in  these  cases.  It  should 
be  noted  that  when  nominal  significance  levels  for  TB  were  dropped  below  5%  in  order  to 
obtain  actual  significance  levels  of  about  5%  in  these  cases,  the  power  for  TB  was 
comparable  to  that  of  TBA  shown  in  the  table. 

It  should  be  noted  that  all  simulations  discussed  in  this  section  are  concerned  with 
the  case  in  which  Zt  is  AR(1).  We  have  found  these  results  to  be  representative  of  the 
case  p  >  1  if  4>(B)  has  at  most  a  single  factor  near  1  -  B.  In  the  next  section,  in 
conjunction  with  the  analysis  of  actual  data  series,  we  will  examine  simulation  results  for 

p>  1. 


4.  APPLICATION  TO  TREND  TESTING  IN  ACTUAL  DATA 
In  this  section  we  consider  the  application  of  the  tests  proposed  here  to  actual  data 
sets  of  interest.  We  consider  the  Hansen  and  LebedefF  (1987,  1988)  annual  atmospheric 
temperature  series  for  1880-1987  shown  in  Figure  A-l  as  well  as  acoustic  travel  time 
signals.  Because  of  the  small  to  moderate  lengths  of  these  series  the  bootstrap  tests 
recommended  here  will  be  applied. 

(a)  Atmospheric  Temperature  Series 

Woodward  and  Gray  (1993,  1995)  considered  model  (1)  where  Zt  is  modeled  as 
an  AR(8)  as  a  possible  model  for  the  Hansen  and  LebedefF  data.  The  model  they  obtain, 
using  least  squares  estimates  of  a  and  b  and  Burg  estimates  of  the  autoregressive 
parameters  has  a  =  —  0.410,  b  =  0.0055,  with  Zt  modeled  as 
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(6) 


Zt  =  .4943 Zj-j  -  ,0688Zt_2  +  0333Zt_3  + .1253 Zf_4  -  .1035Z,_5  +  .2973 Z<_6 
—  .2305Z*_7  -|-  .1970^_8  "h  dt 


where  a2a  =  .013064.  Woodward  and  Gray  (1995)  display  the  factors  of  the  eighth  order 
characteristic  polynomial  associated  with  the  model  in  (6)  and  show  that  it  has  a  factor  of 
(1  —  .916J3)  indicating  a  root  near  but  not  extremely  close  to  +1.  Bloomfield  and  Nychka 
(1992)  considered  the  test  statistic  given  by  6/SE(6)  where  SE(6)  takes  the  correlation  in 
the  residuals  into  account  and  is  given  by 

SE(6)  =  [2 f  W(f)S(J)df] (7) 

where 

W(f)  =  I  J2bte~2nift  2. 

<=i 

with 

ft  t~1 

°t  =  n - , 

E(^-i)2 

<=1 

/S  —IE  /N  ^  ^ 

i.e.,  b  =  F°r  a  given  realization,  Zt  is  estimated  by  Zt  =  Yt  -  a  -  bt  and  S(f)  is 

t~  l 

an  appropriate  estimate  of  the  spectrum  of  Zt. 

Using  this  procedure  a  test  statistic  of  4.70  was  obtained  which  when  compared  to 
usual  normal  or  Student's  t  critical  values  is  highly  significant  indicating  the  presence  of  a 
trend.  These  results  were  consistent  with  Bloomfield  and  Nychka's  (1992)  analysis  of  the 
temperature  series.  However,  as  pointed  out  here  and  by  Woodward  and  Gray  (1993),  the 
true  significance  levels  of  these  tests  can  be  sufficiently  large  to  make  the  finding  of 
significance  meaningless.  We  used  TB  and  TBA  to  perform  a  1-sided  test  against  the 
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alternative  that  b  >  0.  Use  of  TB  resulted  in  a  test  statistic  of  2.73.  The  1 -sided  critical 
value  t  *5  was  obtained  from  B  =  399  bootstrap  samples  to  be  4.50,  i.e.,  the  test  does  not 
indicate  a  significant  slope.  The  bootstrap-based  p-value  is  14.3%.  For  TBA  we  obtained 
tadj  =  1.95  which  should  be  compared  with  the  same  95%  critical  value  of  4.50  and  this 
again  is  not  significant  at  the  5%  level.  The  bootstrap-based  p-value  in  this  case  is  22.3%. 

In  order  to  examine  the  observed  significance  level  in  this  case  we  fit  a  stationary 
AR(8)  model  to  the  temperature  data  without  first  removing  the  line.  The  model  obtained 
is  the  AR(8)  model 

Yt  =  .5547Ft_!  -  .0421^-2  +  .0669Yt-3  +  .1569^-4  ~  0919^-5  +  .322iyt_6 
-  .232iyt_7  +  .2017yt-8  +  at  (8) 

with  a2  =  0138.  In  this  case  the  characteristic  polynomial  has  a  factor  of  1  -  98B 

a 

indicating  that  the  model  is  very  nearly  nonstationary  with  a  root  near  +1.  Thus,  the 
trending  behavior  is  accounted  for  in  this  model  via  this  near  nonstationarity.  Realizations 
of  length  n  =  100,  150  and  200  were  generated  from  the  model  in  (8)  and  the  percentage 
of  realizations  for  which  the  tests  found  significant  trends  are  shown  in  Table  A-5(a) 
where  it  can  be  seen  that  the  significance  levels  for  TB  are  around  8%  while  those  for 
TBA  are  close  to  the  nominal  level  of  5%.  Consistent  with  the  simulation  results  of  the 
previous  section,  the  significance  levels  using  BN  were  excessively  high,  with  the 
significance  levels  in  this  case  being  about  25%.  The  power  figures  for  TB  and  TBA  are 
shown  in  Table  A-5(b)  based  on  realizations  generated  from  (6)  which  actually  contains  a 
line.  If  this  were  the  true  model  and  100  years  of  data  were  available,  then  it  can  be  seen 
that  the  trend  will  be  detected  about  76%  of  the  time  with  TB  and  about  50%  of  the  time 
with  TBA.  If  150  or  200  years  of  observations  were  available,  then  it  can  be  seen  that  both 
tests  would  have  a  strong  chance  of  detecting  the  trend.  These  results  certainly  do  not 
indicate  a  trend  in  the  Hansen  and  Lebedeff  data,  and  even  TB  (which  has  an  inflated 


significance  level)  does  not  yield  significance  while  it  has  76%  power  against  alternatives 
such  as  (6)  for  n  =  100. 

A  final  comment  in  this  section  concerns  the  ARIMA(9,1,0)  model  fit  by 
Woodward  and  Gray  (1993)  to  the  temperature  series.  Woodward  and  Gray  showed  that 
about  67%  of  realizations  of  length  n  =  108  from  this  model  were  seen  to  have  a 
significant  trend  using  a  2-sided  version  of  Bloomfield  and  Nychka's  (1992)  test  at  the  5% 
nominal  significance  level.  For  comparison  with  the  results  in  Table  A-5,  it  should  be 
noted  that  this  figure  reduces  to  approximately  35%  when  a  1-sided  test  is  used.  In  Table 
A-5(c)  we  show  the  percentage  of  realizations  from  this  ARIMA(9,1,0)  model  for  which 
TB  and  TBA  found  significant  slopes.  There  it  can  be  seen  that  TB  and  TBA  found  a 
significant  slope  at  the  nominal  5%  level  about  13%  and  7%  of  the  time  respectively  for 
the  realization  lengths  considered.  It  should  be  realized  that  the  ARIMA(9,1,0)  is  not  a 
special  case  of  (1)  with  6  =  0  since  in  (1)  Zt  is  assumed  to  be  stationary.  Thus,  the 
ARIMA(9,1,0)  model  is  simply  an  alternative  to  (1),  and  we  would  not  expect  the 
percentage  of  realizations  for  which  a  significant  slope  is  (incorrectly)  found  to  be  at  the 
nominal  5%  level.  This  is  the  case  because  of  our  procedure  of  generating  the  bootstrap 
realizations  from  a  stationary  AR(p)  model  fit  to  the  data.  However,  we  see  that  these 
percentages  are  much  closer  to  5%  than  with  the  Bloomfield  and  Nychka  test,  and  thus 
there  is  less  likelihood  of  confusion  between  a  series  with  real  trend  and  one  with  ARIMA- 
type  random  trends  using  the  new  tests.  If  we  included  a  check  for  unit  root  and 
generated  the  bootstrap  realizations  from  a  model  including  a  unit  root  model  if  this  was 
indicated,  then  the  percentage  of  realizations  from  the  ARIMA(9,1,0)  model  for  which  a 
significant  trend  is  detected  should  be  approximately  the  nominal  level. 

(b)  Ocean  Acoustic  Travel  Times 

Since  the  speed  of  sound  in  the  ocean  increases  as  ocean  temperature  increases,  an 
indication  of  the  presence  of  global  warming  is  a  negative  trend  for  the  travel  time  of  an 


acoustic  pulse  along  a  long  fixed  path.  The  MASIG  (Mesoscale  Air-Sea  Interaction 
Group)  model  is  a  reduced  gravity  ocean  model  driven  by  COADS  (Comprehensive 
Ocean-Atmosphere  Data  Set)  winds,  coupled  to  an  equatorial  model  at  its  southern 
boundary  (Pares-Sierra  and  O'Brien,  1989).  A  20  year  simulation  from  this  model, 
assuming  no  warming,  for  acoustic  travel-time  anomaly  along  a  path  from  Hawaii  to  San 
Diego  is  plotted  in  Figure  A-2.  A  question  of  interest  concerns  the  length  of  time  before  a 
warming  (if  it  existed)  would  be  detected  based  on  the  test  for  trend  described  in  Section 
2.  The  data  plotted  in  Figure  A-2  consist  of  monthly  values  for  the  20  years  simulated. 
We  modeled  the  last  160  points  of  the  MASIG  data  as  an  AR(10).  This  model  is  given 
by 

Zt  =  2.1765Zt_i  -  \.6926Zt-2  +  .9443 Zt-3  -  .7730Z«_4  +  .6793 Zt-5  -  .4383Zt_6 

+  .0608^-7  -  .2369Zf_8  +  .4745Zt_9  -  .2024Z*_io  +  at  (9) 

with  cr2a  =  .001196.  There  are  no  positive  real  roots  in  the  characteristic  equation  but 
there  is  a  pair  of  complex  roots  with  small  complex  component  (i.e.,  associated  with  a 
frequency  near  zero).  Our  simulation  consisted  of  generating  realizations  from  (1)  with  Zt 
given  in  (9)  based  on  two  warming  scenarios:  lines  with  slopes  of  —  0.05  and  —0.1 
seconds/year.  These  slopes  correspond  to  increases  in  ocean  temperature  along  the  path 
of  0.005  and  0.01  Celsius  per  year.  In  Table  A-6  we  give  the  results  using  TB  and  TBA 
assuming  5  to  30  years  of  monthly  data.  Again,  the  other  tests  would  not  be  appropriate 
because  of  the  small  realization  lengths  required  in  this  application.  In  the  table  we  also 
consider  the  case  in  which  there  is  no  warming,  i.e.,  6  =  0.  There  it  can  be  seen  that  when 
6  =  0  both  tests  produce  actual  significance  levels  which  did  not  differ  significantly  from 
the  nominal  5%  level,  with  the  only  exception  being  TB  at  5  years. 

For  the  two  warming  scenarios  considered,  it  is  clear  that  warming  will  not  be  able 
to  be  detected  with  high  probability  unless  data  are  collected  for  a  sufficiently  long  period. 
Based  on  the  significance  level  results,  one  would  probably  be  willing  to  use  TB  because 


of  its  higher  power.  Even  in  this  case,  for  warming  to  be  detected  at  least  50%  of  the  time 
in  the  simulations,  at  least  25  years  of  data  are  needed  if  the  slope  is  -  0.1/year,  and  at 
least  30  years  of  data  are  needed  if  the  slope  is  —  0.05/year. 

6.  CONCLUDING  REMARKS 

The  results  presented  here  are  encouraging  and  indicate  that  it  is  possible  to 
construct  a  test  for  trend  with  appropriate  significance  levels  in  the  presence  of  highly 
autocorrelated  noise  when  the  realization  length  is  not  large.  The  results  presented  here 
also  indicate  that  the  test  has  reasonable  power. 

The  bootstrap  procedure  described  here  could  be  used  analogously  to  obtain  tests 

/N  XN 

based  on  the  test  statistic  b/SE(b )  where  b  is  the  least-squares  estimate  of  b  and  where 

/v  /s 

SE(6)  is  given  in  (7).  Alternatively,  the  procedure  could  be  based  on  a  test  statistic 
obtained  using  the  Beach  and  MacKinnon  (1978)  ML  procedure.  In  both  cases,  however, 
the  procedures  are  relatively  computationally  intensive,  making  bootstrapping  less 
practical  than  in  the  current  implementation. 

It  should  be  noted  that  in  this  paper  we  have  assumed  that  the  order  p  of  </>(£?),  the 
autoregressive  model  fit  to  the  estimated  residuals  Zt,  is  the  same  as  the  order  of 
4>  'q\B)  obtained  by  assuming  b  =  0,  i.e.,  the  autoregressive  order  of  the  model  fit  to  the 
Yt  series  itself.  In  fact,  this  need  not  be  the  case  and  the  procedure  described  here  could 
be  modified  to  allow  for  different  orders.  In  fact,  as  indicated  in  Section  5,  the  model  may 
be  allowed  to  have  one  or  more  unit  roots  if  these  are  indicated  as  appropriate  in  the 
modeling  procedure.  In  this  case  the  problem  with  high  significance  levels  when  roots  are 
on  or  very  near  the  unit  circle  may  be  alleviated  without  the  adjustment  technique 
described  here. 

The  procedure  described  here  is  similar  to  that  given  by  Woodward  and 
Gray(1995).  In  that  paper,  a  bootstrap-based  classification  analysis  technique  was 
presented  for  ascertaining  which  of  two  competing  models  produced  realizations  most  like 


the  observed  data.  The  results  obtained  there  do  not  deal  with  the  issue  of  testing  the 
hypothesis  Ho  :  b  =  0  and  controlling  the  probability  of  a  type  one  error.  Their  findings 
related  to  the  temperature  data  were  consistent  with  those  presented  here  in  that  a  trend 
component  was  not  indicated  for  the  temperature  series. 
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Table  A-l.  Burg  Estimates  of  <f>\ Before  and  After  Removing 
the  Least  Squares  Line 

(250  replicates) 

True  4>\ 

.5  .8  .9  .95  .99 

Before 
50  After 
n  Before 

100  After 
Before 
500  After 


.455 

.718 

.818 

.857 

.890 

.421 

.666 

.757 

.784 

.818 

.473 

.765 

.860 

.908 

.943 

.458 

.745 

.833 

.876 

.902 

.499 

.792 

.893 

.942 

.981 

.495 

.788 

.889 

.938 

.974 

O'** 

Table  A-2.  Observed  97.5th  percentiles  of  t  values  calculated  as  in 

5(d) 

based  on  1000  realizations  from  the  model 
Yf  =  bt  -f-  Zf 

where  (1  —  <f>\B)Zt  =  at  and  6  =  0 


n 


50 

100 

150 


<t>i 


0.50 

0.80 

0.95 

0.99 

CnT 

1 

1 

OJ 

*+o* 

2.40 

3.01 

5.12 

7.21 

2.01 

2.15 

2.65 

4.18 

6.98 

1.98 

2.00 

2.26 

2.93 

5.03 

1.98 
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Table  A-3.  Observed  significance  levels  associated  with  tests  % 

for  b  =  0  based  on  the  model 
Yi  =  a  •+•  bt  •+-  Zt 
where  (1  —  faB)Zt  =  a* 

• 

Nominal  level  =  5% 

2-sided  tests 

1000  replications  with  B  =  399  for  TB  and  TBA 

(250  replications  for  MLE,  BN  and  T)  • 


fa  =  .8  _ fa  =  .95 


MLE 

BN 

T 

TB 

TBA 

MLE 

BN 

T 

TB 

TBA 

50 

16.2 

22.0 

18.4 

5.9 

5.3 

36.2 

36.0 

37.2 

10.0 

6.2 

100 

12.0 

16.8 

16.0 

6.4 

5.9 

25.2 

40.0 

28.4 

7.6 

4.3 

250 

7.6 

12.4 

8.0 

3.8 

4.1 

15.6 

16.8 

17.6 

6.4 

5.7 

500 

6.2 

4.8 

5.6 

4.6 

5.1 

10.8 

12.8 

15.2" 

6.4 

6.2 

1000 

5.4 

4.4 

4.8 

4.1 

4.0 

8.4 

6.0 

9.6 

6.1 

6.2 

SE 

1.4 

0.7 

1.4 

0.7  | 

fa  = 

=  .99 

fa  = 

-.95 

TB 

TBA 

TB 

TBA 

50 

14.9 

8.2 

4.5 

4.5 

100 

13.6 

6.1 

5.0 

5.1 

250 

10.5 

6.3 

4.3 

4.9 

500 

8.3 

6.7 

6.0 

6.1 

1000 

5.7 

4.9 

5.4 

5.6 

SE 

0.7  1 
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Table  A-4.  Observed  power  associated  with  TB  and  TBA 
for  n  =  100  and  various  values  of  b  where  the  model  is 
Yt  =  o  +  bt  +  Zf 

where  (1  —  <p\B)Zt  =  at 

Nominal  level  =  5% 

2-sided  tests 

1000  replications  with  B  —  399 
b 

0.05  0.10  0.15 


TB 

TBA 

TB 

TBA 

TB 

TBA 

0.95 

47.3 

26.8 

80.3 

48.3 

93.5 

71.7 

0.90 

58.4 

38.3 

86.4 

58.3 

97.2 

78.7 

0.80 

90.4 

82.0 

99.2 

92.5 

100.0 

97.0 

0.00 

100.0 

100.0 

100.0 

100.0 

100.0 

100.0 

SE 

16  .  1 
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Table  A-5:  Observed  significance  levels  and  powers  for  "no  line"  and  "line"  models  * 

fit  to  the  Hansen  and  LebedelT  temperature  data 

1000  replications  for  TB  and  TBA,  B  =  399 

250  replications  for  BN  # 

Nominal  level  =  5% 

1 -sided  tests 


n 


(a)  Observed  Significance  Level  (b)  Observed  Power 


AR(8) 


BN 

TB 

TBA 

100 

24.4 

8.0 

5.9 

150 

23.6 

9.1 

6.6 

200 

27.6 

7.6 

5.5 

SE 

1.4 

0.7  1 

AR(8)  +  line 


TB 

TBA 

76.2 

50.2 

93.3 

77.6 

99.9 

94.8 

1  g  | 

(c)  Alternative  Nonstationary  Model 


ARIMA(9, 1,0) 


TB 

TBA 

100 

11.2 

6.0 

150 

14.4 

8.9 

200 

12.1 

7.0 

SE 

J 

A-24 


Table  A-6:  Observed  significance  levels  and  powers 
for  AR(10)  model  fit  to  monthly  MASIG  data 


1000  replications,  B  —  399 


Nominal  level  =  5% 


1 -sided  tests 


Years 


Significance 

Level 


TB 

TBA 

5 

7.5 

4.3 

10 

6.0 

4.3 

15 

5.3 

5.1 

20 

4.6 

4.2 

25 

5.2 

5.0 

30 

|  6.0 

5.7 

SE 

0.7  | 

Power 


-  0.05/yr  slope 


TB 

TBA 

!  9.5 

5.3 

10.5 

7.5 

13.7 

11.6 

21.7 

18.4 

31.8 

28.6 

49.5 

45.7 

1.6 


—  0. 1/yr  slope 


TB 

TBA 

9.8 

4.9 

14.4 

8.7 

26.8 

22.7 

45.8 

39.0 

70.5 

62.0 

88.0 

83.7 

1.6 


A-25 


Travel  Time  Anomaly 


A-27 


APPENDIX  B 


THE  TRENDS  SOFTWARE 


B.l.  Introduction 

The  TRENDS  software  performs  four  basic  computations  related  to  detecting  a  trend  in 
time  series  data:  1)  Trend  Detection,  2)  Power  of  the  Test  (probability  of  detecting  a  trend), 
3)  Trend  Stability  (will  the  trend  continue?),  and  4)  Classification  Statistics.  The  trend 
detection  software  runs  in  one  of  two  modes:  automatic  (novice)  and  manual  (expert).  At 
any  stage  in  the  analysis  the  user  may  turn  off  or  on  the  customization  button.  With  the 
customization  button  off  the  program  will  perform  the  calculation  with  no  intermediate 
input  from  the  user.  The  customization  on  mode  allows  the  user  to  check  the  calculations  at 
various  intermediate  stages  and  to  change  inputs  if  desired. 

B .  2 .  Getting  Started 

To  get  started  a  set  of  time  series  data  must  be  loaded.  The  time  series  data  is  assumed  to 
be  equally  spaced  in  time.  The  data  should  be  in  a  file  with  the  first  entry  an  integer  giving 
the  number  of  data  points,  n,  followed  by  a  column  of  n  real  data  values.  Once  loaded,  a 
plot  of  the  data  appears  in  the  main  window  (see  figure  B-l).  A  slider  bar  allows  the  user 
to  choose  any  contiguous  subset  of  the  data  on  which  to  perform  the  subsequent  analysis. 

B.3.  Trend  Detection 

This  option  answers  the  question:  is  there  a  trend  in  the  selected  time  series  (or  subset)? 
For  the  selected  time  series,  the  test  described  in  section  2.2  is  performed  (the  default 
number  of  bootstrap  replications  is  399).  If  the  test  statistic,  t ,  falls  in  the  critical  region 
the  trend  is  significant,  at  the  5%  level,  and  is  said  to  have  been  detected.  When  the 
customization  button  is  on,  the  user  is  shown  the  trend  program  input  dialog  before  it  is 
executed.  At  this  point  the  user  has  the  option  of  changing  any  of  the  input  parameters, 
which  have  their  default  values.  The  program  chooses  an  order  for  the  autoregressive 
process  used  to  model  the  noise.  The  user  has  the  option  of  changing  this  parameter  at  this 
time. 


Time  Series  Plot 


is  there 
a  trend  ? 


How  good 
is  the  test  ? 


Will  it 
continue  ? 


Status:  [ 


Figure  B-l.  Main  window  of  TRENDS  show  input  time  series. 

The  user  may  wish  to  carry  out  this  entire  process  with  no  automation.  After  having 
selected  the  data  (or  subset),  select  “Pre-Processor”  on  the  Transformation  menu.  Remove 
best  fit  line  by  selecting  “Remove  from  Data”  under  Linear  Trend.  The  transformed  data 
with  the  best  fit  line  removed  is  plotted  along  with  the  original  data.  (If  the  plot  is  not  on 
the  screen  it  can  be  selected  from  the  View  menu.)  Estimate  the  order  of  an  autoregressive 
process  (AR)  to  model  this  transformed  series  by  selecting  “AIC  Estimate”  from  the  GW- 
Functions  menu  and  executing.  To  mn  the  trend  program  select  “Trend  Program”  from  the 
program  menu  and  choose  “Single  Data  Set”  and  “Original”  buttons.  One-  or  two-sided 
testing  may  be  chosen  and  the  AR  order  for  the  transformed  data  may  be  changed.  The 
final  result  can  be  viewed  by  selecting  “Current  Session”  from  the  View  menu.  Detailed 
output  may  be  viewed  by  selecting  “Reports”  from  the  view  menu. 

B.4.  Power  of  the  Test  (Probability  of  detecting  a  trend) 

This  option  computes  an  estimate  of  the  power  of  the  test  of  significance  of  trend,  i.e.,  an 
estimate  of  the  probability  that  the  trend  will  be  judged  as  significant.  This  probability  is 
computed  by  generating  many  realizations  (the  default  is  100)  from  a  line  plus  (AR)  noise 
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model  fit  to  the  data  and  calculating  the  percentage  of  those  realizations  with  significant 
trend  using  the  test  procedure  described  in  section  2.2. 

When  the  customization  button  is  on,  the  user  is  shown  the  trend  program  input  dialog 
before  it  is  executed.  At  this  point  the  user  has  the  option  of  changing  any  of  the  input 
parameters,  which  have  their  default  values  (the  default  value  for  the  number  of  realizations 
is  100).  The  model  parameters  may  be  viewed,  and  changed  if  desired,  by  selecting  the 
“Model  Parameters”  button  in  the  Trend  dialog  or  by  selecting  “Parameters”  from  the  View 
menu.  The  series  length  for  the  data  to  be  generated  and  the  number  of  realizations  to  be 
performed  (the  larger  the  number  of  realizations,  the  better  the  statistics  but  the  longer  the 
execution  time)  are  selected  using  the  slider  bars.  With  this  option  one  can  obtain  answers 
to  questions  such  as  how  long  will  it  take  to  detect  a  trend  in  data  similar  to  a  given  data  set 
and  how  large  would  the  trend  have  to  be  in  a  given  data  set  to  be  significant.  For 
example,  to  answer  the  question  how  long  will  it  take  to  detect  a  trend  in  data  similar  to  a 
given  data  set  one  increases  the  length  of  the  realizations  in  the  dialog  box  until  the  desired 
probability  of  detection  is  reached.  One  can  easily  determine  the  probability  of  detection  as 
a  function  of  series  length  by  running  the  program  with  sequentially  increasing  values  of 
the  series  length. 

The  user  may  wish  to  carry  out  this  entire  process  with  no  automation.  After  having 
selected  the  data  (or  subset),  select  “Pre-Processor”  on  the  Transformation  menu.  Remove 
the  best  fit  line  by  selecting  “Remove  from  Data”  under  Linear  Trend.  The  transformed 
data  with  the  best  fit  line  removed  is  plotted  along  with  the  original  data.  (If  the  plot  is  not 
on  the  screen  it  can  be  selected  from  the  View  menu.)  To  model  this  transformed  series  as 
an  autoregressive  process  (AR)  first  estimate  the  order  by  selecting  “AIC  Estimate”  from 
the  GW-Functions  menu  and  executing.  Next  estimate  the  AR  coefficients  by  selecting 
“Estimate  Coeffs”  from  the  GW-Functions  menu.  To  run  the  trend  program  select  “Trend 
Program”  from  the  program  menu  and  choose  “Generate  Data  from  Model”.  Choose  the 
realization  length  and  the  number  of  realizations  using  the  slider  bar.  The  entire  set  of 
model  parameters  may  be  viewed,  and  changed  if  desired,  by  selecting  “Parameters”  from 
the  View  menu.  One-  or  two-sided  testing  may  be  chosen  and  the  AR  order  for  the 
transformed  data  may  be  changed  (it  should  be  equal  to  the  model  order  of  the  data  to  be 
generated).  The  final  result  can  be  viewed  by  selecting  “Current  Session”  from  the  View 
menu.  Detailed  output  may  be  viewed  by  selecting  “Reports”  from  the  view  menu. 
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B.5.  Trend  Stability  (Will  the  trend  continue?) 

This  option  determines  whether  the  trend  detected  as  significant  in  option  1  is  predicted  to 
continue  by  determining  if  the  selected  series  is  best  classified  as  a  line  plus  noise  model  or 
an  ARIMA  model  using  the  approach  described  in  Bottone,  Gray,  and  Woodward  [1995]. 
The  answer  is  either  “the  trend  will  continue”  (line  plus  noise)  or  “the  trend  will  not 
continue”  (ARIMA). 

When  the  customization  button  is  on,  the  user  is  shown  the  bootstrap  program  input  dialog 
before  it  is  executed.  At  this  point  the  user  has  the  option  of  changing  any  of  the  input 
parameters,  which  have  their  default  values  (the  default  number  of  bootstrap  replications  is 
399).  The  program  chooses  an  order  for  the  autoregressive  process  for  the  noise  in  a  line 
plus  noise  model  and  the  order  of  the  ARIMA  process  used  in  the  bootstrap  classification. 
The  user  has  the  option  of  changing  these  parameters  at  this  time.  The  user  may  also 
choose  between  parametric  and  non-parametric  bootstrapping  and  choose  a  white  noise 
variance  classification  or  an  Anderson  classification. 

The  user  may  wish  to  carry  out  this  entire  process  with  no  automation.  After  having 
selected  the  data  (or  subset),  select  “Pre-Processor”  on  the  Transformation  menu.  Remove 
best  fit  line  by  selecting  “Remove  from  Data”  under  Linear  Trend.  The  transformed  data 
with  the  best  fit  line  removed  is  plotted  along  with  the  original  data.  (If  the  plot  is  not  on 
the  screen  it  can  be  selected  from  the  View  menu.)  Estimate  the  order  of  an  autoregressive 
process  (AR)  to  model  this  transformed  series  by  selecting  “AIC  Estimate”  from  the  GW- 
Functions  menu  and  executing.  To  run  the  bootstrap  program  select  “Boot  Program”  from 
the  program  menu  and  choose  “Single  Data  Set”  and  “Original”  buttons.  Parametric  or 
non-parametric  bootstrapping  may  be  chosen  and  the  white  noise  variance  (WNV)  test  or 
the  Anderson  test  may  be  chosen.  The  AR  order  for  the  line  plus  noise  model  and  the 
orders  for  the  ARIMA  model  may  be  changed.  The  final  result  can  be  viewed  by  selecting 
“Current  Session”  from  the  View  menu.  Detailed  output  may  be  viewed  by  selecting 
“Reports”  from  the  view  menu. 

B.6.  Classification  Statistics 

This  option  computes  the  classification  statistics  showing  how  well  the  classification 
procedure  of  option  3  works.  The  result  is  a  2  x  2  matrix  giving  an  estimate  of  the 
probability  that  the  data  will  be  classified  as  line  plus  noise  given  that  it  is  line  plus  noise  in 
the  upper  left  hand  comer.  The  lower  right  hand  entry  is  the  probability  that  the  data  will 
be  classified  as  ARIMA  given  that  it  is  ARIMA.  The  off-diagonal  entries  give  the 


# 


# 
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probability  that  line  plus  noise  is  chosen  when  the  data  actually  comes  from  an  ARIMA 
model  and  vice  versa. 

When  the  customization  button  is  on,  the  user  is  shown  the  boot  program  input  dialog 
before  both  the  first  stage  (A)  and  second  stage  (B)  are  executed.  The  first  stage  computes 
the  probabilities  that  realizations  generated  from  a  line  plus  noise  model  fit  to  the  data  is 
classified  as  line  plus  noise  or  ARIMA.  The  second  stage  computes  the  probabilities  that 
realizations  generated  from  an  ARIMA  model  fit  to  the  data  is  classified  as  line  plus  noise 
or  ARIMA.  At  this  point  the  user  has  the  option  of  changing  any  of  the  input  parameters, 
which  have  their  default  values  (the  default  for  the  number  of  realizations  is  100).  The 
model  parameters  may  be  viewed,  and  changed  if  desired,  by  selecting  the  “Model 
Parameters”  button  in  the  Boot  dialog  or  by  selecting  “Parameters”  from  the  View  menu. 
The  series  length  for  the  data  to  be  generated  and  the  number  of  realizations  to  be 
performed  (the  larger  the  number  of  realizations,  the  better  the  statistics  but  the  longer  the 
execution  time)  are  selected  using  the  slider  bars.  The  number  of  bootstrap  replications  can 
also  be  changed  using  the  slider  bar. 

The  user  may  wish  to  carry  out  this  entire  process  with  no  automation: 

A.  After  having  selected  the  data  (or  subset),  select  “Pre-Processor”  on  the  Transformation 
menu.  Remove  best  fit  line  by  selecting  “Remove  from  Data”  under  Linear  Trend.  The 
transformed  data  with  the  best  fit  line  removed  is  plotted  along  with  the  original  data.  (If 
the  plot  is  not  on  the  screen  it  can  be  selected  from  the  View  menu.)  To  model  this 
transformed  series  as  an  autoregressive  process  (AR)  first  estimate  the  order  by  selecting 
“AIC  Estimate”  from  the  GW-Functions  menu  and  execute.  Next  estimate  the  AR 
coefficients  by  selecting  “Estimate  Coeffs”  from  the  GW-Functions  menu.  Run  the 
bootstrap  program  by  selecting  “Boot  Program”  from  the  program  menu  and  choose 
“Generate  Data  from  Model”.  Choose  the  realization  length,  the  number  of  realizations  and 
the  number  of  replications  using  the  slider  bar.  The  entire  set  of  model  parameters  may  be 
viewed,  and  changed  if  desired,  by  selecting  “Parameters”  from  the  View  menu. 
Parametric  or  non-parametric  bootstrapping  may  be  chosen  and  the  white  noise  variance 
(WNV)  test  or  the  Anderson  test  may  be  chosen.  The  AR  orders  for  the  line  plus  noise 
model  and  the  ARIMA  model  fit  to  the  realizations  may  be  changed.  The  final  result  (left 
hand  entries  of  the  classification  matrix)  can  be  viewed  by  selecting  “Current  Session”  from 
the  View  menu.  Detailed  output  may  be  viewed  by  selecting  “Reports”  from  the  view 


B.  Re-enter  the  data.  Select  “operator”  from  the  Transformation  menu  and  apply  the 
operator  (1-B)  (which  is  the  default)  on  the  series.  The  transformed  data  is  plotted  along 
with  the  original  data.  (If  the  plot  is  not  on  the  screen  it  can  be  selected  from  the  View 
menu.)  To  model  this  transformed  series  as  an  autoregressive  process  (AR)  first  estimate 
the  order  by  selecting  “AIC  Estimate”  from  the  GW-Functions  menu  and  executing.  Next 
estimate  the  AR  coefficients  by  selecting  “Estimate  Coeffs”  from  the  GW-Functions  menu. 
Choose  “MULT”  from  the  GW-Functions  menu  and  load  the  AR(p)  operator,  multiply  by 
(1-B)  (the  default)  and  apply.  Run  the  bootstrap  program  by  selecting  “Boot  Program” 
from  the  program  menu  and  choose  “Generate  Data  from  Model”.  Choose  the  realization 
length,  the  number  of  realizations  and  the  number  of  replications  using  the  slider  bar.  The 
entire  set  of  model  parameters  may  be  viewed,  and  changed  if  desired,  by  selecting 
“Parameters”  from  the  View  menu.  Parametric  or  non-parametric  bootstrapping  may  be 
chosen  and  the  white  noise  variance  (WNV)  test  or  the  Anderson  test  may  be  chosen.  The 
AR  orders  for  the  line  plus  noise  model  and  the  ARIMA  model  fit  to  the  realizations  may  be 
changed.  The  final  result  (right  hand  entries  of  the  classification  matrix)  can  be  viewed  by 
selecting  “Current  Session”  from  the  View  menu.  Detailed  output  may  be  viewed  by 
selecting  “Reports”  from  the  view  menu. 

B.7.  Example 

Figure  B-2  shows  the  main  window  after  a  complete  run.  The  time  series  analyzed, 
shown  in  figure  B-2  is  108  monthly  values  (9  years)  of  simulated  data  from  the  GFDL 
model  [Manabe  et  al.,  1991]  representing  the  travel-time  anomaly  along  a  path  from  Hawaii 
to  San  Diego.  The  upper  left  hand  comer  shows  the  result  of  step  1  that  a  trend  is  detected 
in  the  selected  series.  The  power  of  the  test,  which  is  an  estimate  of  the  probability  that  a 
trend  will  be  detected  in  data  similar  to  the  input  data,  appears  as  a  pie  chart  in  the  upper 
right  hand  comer  of  the  main  window,  which  in  this  case  is  96%.  In  other  words,  one 
expects  to  detect  a  trend  in  data  similar  to  the  input  data  96%  of  the  time. 

The  lower  left  hand  comer  shows  that  the  best  eventual  forecast  is  for  the  trend  to  continue 
by  classifying  the  input  time  series  as  line  plus  noise  as  opposed  to  ARIMA.  The 
classification  statistics  appear  as  a  2x2  matrix  in  the  lower  right  hand  corner  of  the  main 
window.  For  this  case,  the  probability  of  classifying  a  time  series  similar  to  the  input 
series  as  line  plus  noise  when  it  really  is  line  plus  noise  is  95%.  The  probability  of 
classifying  the  time  series  as  ARIMA  when  it  really  is  ARIMA  is  66%.  The  off-diagonal 
entries  give  the  probability  of  classifying  the  time  series  as  ARIMA  when  it  is  actually  line 
plus  noise  and  vice  versa. 
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Figure  B-2.  Main  window  of  TRENDS  after  complete  run. 

B.8.  CD  ROM 

A  multimedia  tutorial  about  the  use  of  statistical  methods  in  the  ARPA  Acoustic  Monitoring 
of  Global  Ocean  Climate  (AMGOC)  program  has  been  developed  by  Mission  Research 
Corporation.  It  is  supplied  on  CD  ROM  and  runs  with  a  supplied  viewer  on  both 
Macintosh  and  PC  Compatible  computers.  It  incorporates  advanced  user  navigation 
techniques  to  allow  exploration  of  information  in  visual  form  by  providing  access  to  several 
layers  of  material  at  successive  levels  of  detail.  As  such,  it  is  intended  to  guide  the  user 
through  an  introduction  to  the  background  of  the  overall  program,  an  elementary 
understanding  of  the  statistical  methods  employed,  their  importance  to  the  problem  of 
extraction  of  warming  trends  from  ocean  acoustic  data  and  finally  an  example  of  application 
of  the  TRENDS  software.  The  cover  for  the  CD  ROM  for  the  tutorial  is  shown  in 
figure  B-3.  The  implementation  of  this  multimedia  software  includes  access  to  a 
navigation  Help  section,  a  full  Index  of  the  entire  tutorial  and  a  Map  which  allows  the  user 
to  view  the  tree  of  choices  and  also  to  move  instantly  to  any  chosen  page.  Details  about 
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this  CD  ROM  tutorial  can  be  obtained  at  the  internet  address: 
http://chapman.mrcsb.com/OA.html. 
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